Expectation Maximization Algorithm for Domain Specific Ontology Extraction

Authors

  • Brijesh Bhatt
  • Pushpak Bhattacharyya
Abstract

Learning an ontology from unstructured text is a challenging task. Over the years, much research has addressed predicting the ontological relation between a pair of concepts. However, each of these measures predicts relations with a varying degree of accuracy. There has also been work on learning ontologies by combining evidence from heterogeneous sources, but most of these algorithms are ad hoc in nature. In this paper we investigate a wide range of evidence for predicting the relation between a pair of concepts and propose a standardized Expectation Maximization algorithm to construct a domain-specific ontology. The proposed approach is completely unsupervised and does not require any seed terms or human intervention. In addition, it can easily be adapted to any language. We conducted experiments for two languages, Hindi and English, and two domains, Health and Tourism. The average F-score observed across all experiments is above 0.60.

Similar resources

Ontology Alignment Using Multiple Contexts

Ontology alignment involves determining the semantic heterogeneity between two or more domain specifications by considering their associated concepts. Our approach considers name, structural and content matching techniques for aligning ontologies. After comparing the ontologies using concept names, we examine the instance data of the compared concepts and perform content matching using value ty...

Quantitative SPECT and planar 32P bremsstrahlung imaging for dosimetry purpose –An experimental phantom study

Background: In this study, quantitative 32P bremsstrahlung planar and SPECT imaging and the consequent dose assessment were carried out as a comprehensive phantom study to define an appropriate method for accurate dosimetry in clinical practice. Materials and Methods: CT, planar and SPECT bremsstrahlung images of a Jaszczak phantom containing a known activity of 32P were acquired. In addition, Phanto...

A Fuzzy C-Means based GMM for Classifying Speech and Music Signals

A Gaussian Mixture Model (GMM) with Fuzzy c-means attempts to classify signals into speech and music. Feature extraction is done before classification. The classification accuracy mainly relies on the strength of the feature extraction techniques. Simple audio features from the time domain and frequency domain are adopted. The time domain features are Zero Crossing Rate (ZCR) and Short Time Energy...

A Statistical Approach for Image Feature Extraction in the Wavelet Domain

In this paper, a new image feature extraction method based on statistical analysis in the wavelet domain is developed for content-based image retrieval (CBIR). A two-component Gaussian mixture model is developed to describe the statistical characteristics of images in the wavelet domain. The model parameters are obtained by an EM (Expectation-Maximization) algorithm and then employed to c...

Temporal Relation Extraction Using Expectation Maximization

The ability to accurately determine temporal relations between events is an important task for several natural language processing applications such as Question Answering, Summarization, and Information Extraction. Since current supervised methods require large corpora, which for many languages do not exist, we have focused our attention on approaches with less supervision as much as possible. ...

Journal title:
  • Research in Computing Science

Volume 90

Publication date: 2015